Smart Paper Notes

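The mass distribution of dark matter haloes is the result of the hierarchical growth of initial density perturbations through mass accretion and mergers. We use an interpretable machine-learning framework to provide physical insight into the origin of the spherically averaged mass profile of dark matter haloes. We train a gradient-boosted-trees algorithm to predict the final mass profiles of cluster-sized haloes, and measure the importance of the different inputs provided to the algorithm. We find two primary scales in the initial conditions (ICs) that affect the final mass profile: the density at approximately the scale of the haloes' Lagrangian patch $R_L$ ($R \sim 0.7\,R_L$), and the density in the large-scale environment ($R \sim 1.7\,R_L$). The model also identifies three primary time-scales in the halo assembly history that affect the final profile: (i) the formation time of the virialized, collapsed material inside the halo; (ii) the dynamical time, which captures the dynamically unrelaxed, infalling component of the halo over its first orbit; and (iii) a third, most recent time-scale, which captures the impact of recent massive merger events on the outer profile. While the inner profile retains memory of the ICs, this information alone is insufficient to yield accurate predictions for the outer profile. When we add information about the haloes' mass accretion history, we find a significant improvement in the predicted profiles at all radii. Our machine-learning framework provides new insights into the role of the ICs and the mass assembly history in determining the final mass profile of cluster-sized haloes.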

Translated by Google Translate
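
The idea of training gradient-boosted trees and then asking which inputs mattered can be illustrated with a small, self-contained sketch. This is not the authors' pipeline: the data are synthetic, the trees are one-split stumps, and the importance measure (share of total squared-error reduction per feature) is one common choice assumed here for illustration. The feature names `strong`, `weak`, and `irrelevant` are hypothetical.

```python
import random

def fit_stump(X, residuals):
    """Best single split as (gain, feature, threshold, left_mean, right_mean)."""
    n, total = len(X), sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Squared-error reduction relative to predicting one overall mean.
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_gbt(X, y, n_rounds=30, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, gains = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, left, right = fit_stump(X, residuals)
        stumps.append((j, thr, lr * left, lr * right))
        gains[j] += gain
        pred = [p + (lr * left if x[j] <= thr else lr * right)
                for p, x in zip(pred, X)]
    importance = [g / sum(gains) for g in gains]  # normalised to sum to 1
    return (base, stumps), importance

def predict(model, x):
    base, stumps = model
    return base + sum(l if x[j] <= thr else r for j, thr, l, r in stumps)

# Synthetic data (an assumption): the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
rng = random.Random(0)
names = ["strong", "weak", "irrelevant"]
X = [[rng.uniform(-1, 1) for _ in names] for _ in range(200)]
y = [3.0 * x[0] + 1.0 * x[1] + rng.gauss(0.0, 0.1) for x in X]

model, importance = fit_gbt(X, y)
mean_y = sum(y) / len(y)
baseline_mse = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse = sum((yi - predict(model, x)) ** 2 for x, yi in zip(X, y)) / len(y)

for name, imp in zip(names, importance):
    print(f"{name:>10}: {imp:.3f}")
print(f"training MSE {mse:.3f} vs. constant-prediction MSE {baseline_mse:.3f}")
```

In the setting of the abstract, the inputs would be quantities such as smoothed initial densities at different radii or points along the mass accretion history, and the importance ranking is what singles out particular scales and time-scales.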

Probabilistic machine learning based predictive and interpretable digital twin for dynamical systems

Tapas Tripura, Aarya Sheetal Desai, Sondipon Adhikari, Souvik Chakraborty

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-19

A framework for creating and updating digital twins for dynamical systems from a library of physics-based functions is proposed. Sparse Bayesian machine learning is used to update the digital twin and to derive an interpretable expression for it. Two approaches for updating the digital twin are proposed. The first approach uses both the input and output information from a dynamical system, whereas the second uses output-only observations. Both methods use a library of candidate functions representing certain physics to infer new perturbation terms in the existing digital twin model. In both cases, the resulting expressions of the updated digital twins are identical, and the epistemic uncertainties are quantified. In the first approach, the regression problem is derived from a state-space model, whereas in the second, the output-only information is treated as a stochastic process. The concepts of Itô calculus and the Kramers-Moyal expansion are used to derive the regression equation. The performance of the proposed approaches is demonstrated using highly nonlinear dynamical systems such as the crack-degradation problem. The numerical results almost exactly identify the correct perturbation terms, along with their associated parameters, in the dynamical system. The probabilistic nature of the proposed approach also helps in quantifying the uncertainties associated with the updated models. The proposed approaches provide an exact and explainable description of the perturbations in digital twin models, which can be used directly for better cyber-physical integration, long-term future predictions, degradation monitoring, and model-agnostic control.
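
The output-only idea can be sketched on a toy problem. The sketch below makes several assumptions that are not in the abstract: a scalar system with nominal drift $-x$, a hidden cubic perturbation $-0.5x^3$, and a polynomial library $\{1, x, x^2, x^3\}$. It also swaps the paper's sparse Bayesian learning for plain sequentially thresholded least squares, so it produces point estimates only and no uncertainty quantification. The regression target comes from the first Kramers-Moyal coefficient, $E[\Delta X \mid X = x]/\Delta t \approx$ drift.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def nominal_drift(x):
    return -x                    # drift of the existing (nominal) model

def true_drift(x):
    return -x - 0.5 * x**3       # hypothetical system with a hidden perturbation

rng = random.Random(1)
dt, n_steps, sigma = 0.02, 200_000, 1.0
names = ["1", "x", "x^2", "x^3"]  # candidate library (an assumption)

# Because the library holds powers of x, the normal equations only need
# power sums: sum(x^m) for m = 0..6 and sum(y * x^m) for m = 0..3.
power_sums = [0.0] * 7
rhs = [0.0] * 4
x = 0.0
for _ in range(n_steps):
    # Euler-Maruyama step of dX = drift dt + sigma dW (the observed output).
    x_next = x + true_drift(x) * dt + sigma * dt**0.5 * rng.gauss(0.0, 1.0)
    # Increment per unit time minus the nominal drift: what is left is the
    # perturbation plus zero-mean noise.
    y = (x_next - x) / dt - nominal_drift(x)
    p = 1.0
    for m in range(7):
        power_sums[m] += p
        if m < 4:
            rhs[m] += y * p
        p *= x
    x = x_next

def sparse_fit(power_sums, rhs, threshold=0.25):
    """Least squares, then drop small coefficients and refit until stable."""
    active = list(range(len(rhs)))
    while active:
        gram = [[power_sums[i + j] for j in active] for i in active]
        theta = solve(gram, [rhs[i] for i in active])
        kept = [i for i, c in zip(active, theta) if abs(c) >= threshold]
        if kept == active:
            return dict(zip(active, theta))
        active = kept
    return {}

identified = {names[i]: c for i, c in sparse_fit(power_sums, rhs).items()}
print(identified)  # surviving perturbation terms and their coefficients
```

The threshold of 0.25 is a tuning choice for this toy problem. A Bayesian treatment, as described in the abstract, would instead place a sparsity-promoting prior on the coefficients and return a posterior over them.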

Explainability of a classification model is crucial when deployed in real-world decision support systems. Explanations make predictions actionable to the user and should inform about the capabilities and limitations of the system. Existing explanation methods, however, typically only provide explanations for individual predictions. Information about conditions under which the classifier is able to support the decision maker is not available, while for instance information about when the system is not able to differentiate classes can be very helpful. In the development phase it can support the search for new features or combining models, and in the operational phase it supports decision makers in deciding e.g. not to use the system. This paper presents a method to explain the qualities of a trained base classifier, called PERFormance EXplainer (PERFEX). Our method consists of a meta tree learning algorithm that is able to predict and explain under which conditions the base classifier has a high or low error or any other classification performance metric. We evaluate PERFEX using several classifiers and datasets, including a case study with urban mobility data. It turns out that PERFEX typically has high meta prediction performance even if the base classifier is hardly able to differentiate classes, while giving compact performance explanations.
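
The general idea of a meta-model that explains where a base classifier performs well or badly can be sketched as follows. This is not the PERFEX algorithm itself: the meta-learner here is a single-split tree chosen by Gini impurity, the data are synthetic, and the feature names `signal` and `condition` are hypothetical. By construction, the labels are clean when `condition` is below 0.5 and pure coin flips otherwise, so the base classifier can only separate the classes in the first regime.

```python
import random

rng = random.Random(0)
feature_names = ["signal", "condition"]  # hypothetical names

def make_sample():
    a, b = rng.random(), rng.random()
    # Labels follow the signal only when the condition is low; otherwise noise.
    label = int(a > 0.5) if b < 0.5 else rng.randint(0, 1)
    return (a, b), label

def base_classifier(features):
    """A fixed base model: threshold on the first feature."""
    return int(features[0] > 0.5)

data = [make_sample() for _ in range(1000)]
X = [features for features, _ in data]
# Meta-target: 1 where the base classifier was right, 0 where it was wrong.
correct = [int(base_classifier(features) == label) for features, label in data]

def gini(positives, n):
    p = positives / n
    return 2 * p * (1 - p)

def fit_meta_stump(X, target):
    """Find the split that best separates correct from incorrect predictions.

    Returns (feature index, threshold, accuracy at or below, accuracy above).
    """
    n, total = len(X), sum(target)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_pos = 0
        for k in range(1, n):
            left_pos += target[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue
            impurity = (k * gini(left_pos, k)
                        + (n - k) * gini(total - left_pos, n - k)) / n
            if best is None or impurity < best[0]:
                best = (impurity, j, (lo + hi) / 2,
                        left_pos / k, (total - left_pos) / (n - k))
    return best[1:]

feature, threshold, acc_low, acc_high = fit_meta_stump(X, correct)
print(f"base accuracy is {acc_low:.2f} when {feature_names[feature]} <= "
      f"{threshold:.2f}, and {acc_high:.2f} otherwise")
```

The printed rule is a compact performance explanation of the kind the abstract describes: it tells a user under which condition the base classifier should not be relied on, even though the base classifier itself never looks at that feature.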

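The kidney is an essential organ of the human body. It maintains homeostasis and removes harmful substances through urine. Renal cell carcinoma (RCC) is the most common form of kidney cancer; around 90% of kidney cancers are attributed to RCC. The most harmful type of RCC is clear cell renal cell carcinoma (ccRCC), which accounts for about 80% of all RCC cases. Early and accurate detection of ccRCC is necessary to prevent the disease from spreading further to other organs. In this article, detailed experiments are carried out to identify important features that can help diagnose ccRCC at different stages. The ccRCC dataset is obtained from The Cancer Genome Atlas (TCGA). A novel feature ranking approach based on mutual information and ensembling is proposed, which considers the order of features obtained from 8 popular feature selection methods. The performance of the proposed method is evaluated by the overall classification accuracy obtained with 2 different classifiers (ANN and SVM). Experimental results show that the proposed feature ranking method attains higher classification accuracy with both SVM and NN than existing work. It is also worth noting that, of the 3 distinguishing features named by the existing TNM system (proposed by AJCC and UICC), the proposed method selects two (tumour size and metastasis status) as the top-ranked features. This establishes the efficacy of the proposed approach.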
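
A minimal sketch of ensemble feature ranking follows. The abstract does not spell out how the orderings are combined, so everything specific here is an assumption: three simple scorers (mutual information, absolute correlation, and single-feature agreement) stand in for the 8 feature selection methods, the orderings are combined by mean rank (a Borda-style rule), and the data are synthetic binary features named `informative`, `weak`, and `noise`.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def abs_correlation(xs, ys):
    """Absolute Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / math.sqrt(vx * vy)

def agreement(xs, ys):
    """Accuracy of predicting y from x directly, or from its negation."""
    acc = sum(x == y for x, y in zip(xs, ys)) / len(xs)
    return max(acc, 1 - acc)

def ranking(scores):
    """Feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def aggregate(rankings):
    """Mean-rank aggregation: a lower average position means a better feature."""
    n_features = len(rankings[0])
    mean_pos = [sum(r.index(j) for r in rankings) / len(rankings)
                for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: mean_pos[j])

# Synthetic data (an assumption): two features are noisy copies of the label
# with different flip rates, and one is pure noise.
rng = random.Random(0)

def noisy_copy(y, flip_prob):
    return (y ^ 1) if rng.random() < flip_prob else y

labels = [rng.randint(0, 1) for _ in range(500)]
features = {
    "informative": [noisy_copy(y, 0.10) for y in labels],
    "weak": [noisy_copy(y, 0.35) for y in labels],
    "noise": [rng.randint(0, 1) for _ in labels],
}
names = list(features)
columns = [features[name] for name in names]

scorers = (mutual_information, abs_correlation, agreement)
rankings = [ranking([score(col, labels) for col in columns]) for score in scorers]
consensus = [names[j] for j in aggregate(rankings)]
print(consensus)
```

In practice, the top of the consensus ranking would then be passed to the downstream classifiers (ANN and SVM in the abstract) to check whether a reduced feature set preserves or improves accuracy.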
